Here's something that took me years to realize: most software bugs don't come from what you wrote wrong. They come from what you assumed was right.
I'm talking about those moments when you stare at a failing system and think, "This should work." The code looks correct. The tests pass. The logic is sound. But somewhere, buried in the foundation of your mental model, there's an assumption that just... isn't true anymore. Maybe it never was.
And now we have AI coding assistants that can generate sophisticated implementations in seconds. Which should make assumptions less dangerous, right? Wrong. It makes them more dangerous, because AI can build on your flawed assumptions faster than you can question them.
What We Mean When We Say "Assumption"
Let's be precise about this. An assumption is an inference you make based on premises you think you know. You've got some knowledge in your head — call it your internal knowledge base — and from that, you conclude something about how the system should behave.
The problem isn't the inference. Human reasoning is pretty good at connecting dots. The problem is with those premises. The stuff you "know" might be completely speculative. Or worse — it might have been true once but isn't anymore.
Here's the kicker: the only way to know if a premise is actually true is to check it. And even then, you need to keep checking it, because systems change. What was true yesterday might be false today. But we keep acting on old assumptions because checking takes effort, and who has time for that?
This is exactly how LLMs behave, by the way. Ask Claude a question, and you get an answer — not necessarily accurate, not necessarily consistent with the previous answer, just the most plausible response based on patterns it's seen before. Same energy. And it's why working with AI makes assumption management even more critical.
The Cognitive Shortcuts We Can't Avoid
In a perfect world, we'd know everything about our systems from the hardware up to the UI. But perfect worlds are boring, and real systems are impossibly complex. Like LLMs, our context window is limited. We can't hold the entire stack in our heads simultaneously.
So we do what any reasonable person would do: we create shortcuts. High-level mental models that hide the complexity beneath. "The database is fast." "Network calls are reliable." "Third-party libraries work as advertised."
These shortcuts aren't laziness — they're survival. Without them, we'd be paralyzed by the sheer volume of things we'd need to verify before writing a single line of code. The problem is that shortcuts become assumptions, and assumptions become invisible dependencies your system relies on.
I once spent an entire week debugging what seemed like a completely random failure in a payment processing system. The error only happened sporadically, always with the same mysterious message, and I was going genuinely insane trying to reproduce it. Turns out our "reliable" third-party payment API had a rate limit that kicked in during peak hours, but only returned a helpful error message to paying enterprise customers. We were on the free tier.
My assumption: "Third-party APIs tell you what's wrong when things break." Reality: "Third-party APIs tell paying customers what's wrong when things break."
The Test Paradox
"Just write tests," you might say. "Tests turn assumptions into verified facts."
True. Tests are one of our best tools for replacing assumptions with proof. Instead of assuming your function handles edge cases, you can demonstrate it. Instead of assuming the API returns what you expect, you can capture its actual behavior.
But here's the paradox: tests themselves are built on assumptions.
When you test a function in isolation, you're assuming the function's dependencies work correctly. When you use a testing framework, you're assuming the framework itself doesn't have bugs. When you mock external services, you're assuming your mocks accurately represent real-world behavior.
Most of the time, these assumptions are safe. Testing frameworks are battle-tested, and mocking well-known APIs is usually reliable. But "usually" and "always" are different words, and the gap between them is where the most insidious bugs live.
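Here's a toy illustration of how a mock quietly encodes an assumption. The `fetch_user` client and the response shape are made up; the point is that the test keeps passing even if the real API changes underneath you:

```python
# A hypothetical function under test. It assumes the users API returns
# a dict with an "email" key.
def greeting(client, user_id: int) -> str:
    user = client.fetch_user(user_id)
    return f"Hello, {user['email']}"

class FakeClient:
    """Test double that encodes our assumption about the real API's response shape."""
    def fetch_user(self, user_id: int) -> dict:
        return {"id": user_id, "email": "a@example.com"}

def test_greeting_against_the_mock():
    # Passes forever, even if the real API renames "email" to "emailAddress".
    assert greeting(FakeClient(), 42) == "Hello, a@example.com"
```

The only thing that turns that response-shape assumption into something verified is a contract-style test that hits the real (or sandbox) service and asserts the same shape.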
The worst-case scenario isn't when nothing works — it's when things work most of the time. Intermittent failures break our assumption-based mental models because they violate the basic premise that computers are consistent. If the same input produces different outputs, something else is interfering with your system, and that something could be at any level of the stack.
The Hidden State Problem
The most dangerous assumptions are the ones about state. Not just database state or application state, but the entire context your code runs in. The assumption that files exist. That network connections are stable. That other services are running. That the system clock is accurate.
I learned this lesson the hard way when a scheduled job that had run reliably for months silently didn't fire one Sunday in March. The code hadn't changed. There were no error messages — the job simply didn't run. After hours of debugging, I discovered the issue: our server was in a timezone that observed daylight saving time, and the job was scheduled for 2 AM. On spring-forward Sunday, clocks jump straight from 1:59 AM to 3:00 AM. The job's trigger time never existed that night.
My assumption: "Time moves forward consistently." Reality: "Time is a social construct with politically motivated exceptions."
This is why microservices can be so brittle despite their architectural elegance. Each service makes assumptions about the availability and behavior of every other service it depends on. When those assumptions are wrong, the failure modes can be spectacular and nearly impossible to diagnose.
The AI Amplification Effect
Working with AI coding assistants has taught me something fascinating about assumptions: they're incredibly contagious. When you ask an AI to implement something, it doesn't just generate code — it generates code based on the assumptions embedded in your prompt.
"Create a function that processes user data" assumes there is user data to process. "Add error handling for network failures" assumes you've correctly identified which failures need handling. "Optimize this database query" assumes the query is actually the bottleneck.
The AI doesn't question your assumptions. It can't. It builds on them, creating sophisticated implementations that inherit all of your flawed premises. The code looks professional, follows best practices, and fails in ways you never anticipated because the failure is in the assumption layer, not the implementation layer.
This isn't a criticism of AI; it's a recognition that assumption validation has become more important, not less. When you can generate working code quickly, the bottleneck shifts from implementation to problem definition. Getting the problem wrong just gets you the wrong solution faster.
The Meta-Skill: Assumption Detection
The real skill isn't avoiding assumptions — that's impossible. The real skill is learning to identify when something is an assumption, especially when it's disguised as fact.
Some assumptions are easy to spot because they're explicitly provisional: "I think this API returns JSON." "Probably no one will upload files bigger than 100MB." "The database should handle this load."
But the dangerous ones masquerade as knowledge: "Users always provide valid email addresses." "This library is thread-safe." "The filesystem has enough space." These feel like facts because they're based on experience, documentation, or conventional wisdom. But experience can be incomplete, documentation can be wrong, and conventional wisdom can be outdated.
The question I've learned to ask: "What would have to be true for this to work?" Then I try to identify which of those things I've verified versus which ones I've simply assumed.
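One way to make that question concrete is to turn the unverified premises into checks that run at startup. A sketch, with entirely hypothetical paths, limits, and variable names:

```python
import os
import shutil
import sys

def verify_preconditions() -> None:
    """Fail fast on the premises this service quietly depends on."""
    # "The filesystem has enough space" -> measure it instead of assuming it.
    free_gb = shutil.disk_usage("/var/data").free / 1e9
    assert free_gb > 5, f"only {free_gb:.1f} GB free under /var/data"

    # "The config we need is set" -> read it; don't assume deployment got it right.
    for var in ("DATABASE_URL", "PAYMENTS_API_KEY"):
        assert os.environ.get(var), f"{var} is not configured"

    # "The runtime supports the features we use" -> pin the premise explicitly.
    assert sys.version_info >= (3, 9), "this service assumes Python 3.9+"

if __name__ == "__main__":
    verify_preconditions()
```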
Building Systems That Make Assumptions Explicit
The best defenses against assumption-based failures aren't perfect knowledge — they're systems designed to surface assumptions and handle their failure gracefully.
Input validation makes assumptions about data explicit. Instead of assuming users provide valid input, you check and handle invalid cases. Retry logic with exponential backoff assumes network failures are temporary but degrades gracefully when they're not. Circuit breakers assume downstream services might be unavailable and provide alternatives.
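As a rough sketch of the middle idea, here's what retry-with-backoff looks like when you treat "failures are temporary" as an assumption that can break (the fetch function in the usage comment is hypothetical):

```python
import random
import time

def with_retries(call, max_attempts: int = 5, base_delay: float = 0.5):
    """Retry a callable on transient errors, backing off exponentially with jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                # The "failures are temporary" assumption just broke; surface it.
                raise
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))

# Usage, with a hypothetical fetch function:
# orders = with_retries(lambda: fetch_orders(customer_id=7))
```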
Health checks, monitoring, and observability are all ways of making system state visible instead of assumed. Instead of assuming your dependencies are working, you actively monitor them and react when they're not.
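Even something as small as this makes the "my dependencies are up" assumption observable instead of implicit (the hosts and ports here are placeholders):

```python
import socket

# Hypothetical dependencies; a real system would read these from config.
DEPENDENCIES = {
    "postgres": ("db.internal", 5432),
    "payments_api": ("api.example-payments.test", 443),
}

def health() -> dict:
    """Report dependency state instead of assuming it. Wire this to a /health endpoint."""
    status = {}
    for name, (host, port) in DEPENDENCIES.items():
        try:
            socket.create_connection((host, port), timeout=1).close()
            status[name] = "ok"
        except OSError as exc:
            status[name] = f"down: {exc}"
    return status
```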
Even configuration can help. Instead of hardcoding assumptions about file paths, database URLs, or timeout values, you make them explicit configuration that can be changed when your assumptions prove wrong.
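A sketch of what that promotion looks like; every name and default below is invented, but each one is an assumption that now has a handle you can turn:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Assumptions promoted to named, overridable configuration."""
    database_url: str = os.environ.get("DATABASE_URL", "postgresql://localhost/app")
    upload_dir: str = os.environ.get("UPLOAD_DIR", "/var/data/uploads")
    http_timeout_s: float = float(os.environ.get("HTTP_TIMEOUT_S", "10"))
    max_upload_mb: int = int(os.environ.get("MAX_UPLOAD_MB", "100"))

settings = Settings()
# When "no one uploads files bigger than 100MB" turns out to be wrong,
# fixing it is a config change, not an archaeology project.
```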
The goal isn't to eliminate assumptions — it's to make them visible, testable, and recoverable when they break.
The Temporal Nature of Truth
Here's the insight that changed how I think about system design: assumptions aren't just right or wrong — they have expiration dates.
"The cache is fast" might be true when you have 1,000 users and false when you have 100,000. "This API endpoint is stable" might be true until the provider decides to deprecate it. "Our users are mostly in the US" might be true until your marketing campaign in Europe succeeds.
Systems evolve, requirements change, and the premises that made your assumptions valid can shift under your feet. The assumption that was perfectly reasonable six months ago can become the source of today's production incident.
This is why documentation goes stale so quickly. It's not that people forget to update it; it's that assumptions documented as facts yesterday quietly became false today, and nobody noticed the dependency.
Living With Uncertainty
Software development is fundamentally about managing uncertainty with incomplete information. We build systems to solve problems we don't fully understand, for users whose behavior we can't predict, on infrastructure we don't completely control.
Assumptions are how we function despite this uncertainty. They're not a bug in human reasoning — they're a feature. The key is treating them as hypotheses rather than facts, and building systems that can adapt when our hypotheses prove wrong.
The code is still the output. But increasingly, the real work is in examining the assumptions that shape what code we write, and designing systems that can survive when those assumptions break.
Because they will. The only question is whether you'll be ready.